The general idea comes from dimensionality reduction. Scaling approaches like NOMINATE, Wordfish and their variants all perform some form of it: given MEPs, documents, words or votes as features, they project points into a lower-dimensional space, usually 1 or 2 dimensions.
Alternatives to W-NOMINATE aren't significantly "better" at clustering MEPs using their votes as features, but are other methods still useful?
Given MEPs and their voting records, there are many parameters and variations to explore.
The quality of each approach is evaluated with silhouette scores (a measure of cluster quality), which range from -1 (incorrect clustering) to +1 (highly dense clustering). Scores around zero indicate overlapping clusters.
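As a rough illustration of how a per-group silhouette score is obtained, here is a minimal dependency-free sketch. For each point, `a` is the mean distance to the other members of its own group and `b` is the smallest mean distance to any other group; the score is `(b - a) / max(a, b)`. The function names, the Euclidean distance and the toy coordinates are assumptions made for the example, not taken from the notebook; in practice a library routine such as scikit-learn's `silhouette_samples` would normally be used instead.

```python
from collections import defaultdict
from math import dist


def silhouette_samples(points, labels):
    """Return the silhouette value (b - a) / max(a, b) for each point."""
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Singleton cluster: the score is 0 by convention.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


def silhouette_by_group(points, labels):
    """Average the per-point silhouette values within each group."""
    per_group = defaultdict(list)
    for label, score in zip(labels, silhouette_samples(points, labels)):
        per_group[label].append(score)
    return {label: sum(vals) / len(vals) for label, vals in per_group.items()}


# Toy 2-D coordinates for six hypothetical MEPs in two made-up groups.
coords = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (1.1, 1.0), (1.0, 1.1)]
groups = ["A", "A", "A", "B", "B", "B"]
group_scores = silhouette_by_group(coords, groups)
```

Two tight, well-separated groups like these score close to +1; groups whose members sit nearer to another group's members than to their own drift towards -1.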
from IPython.display import display, HTML, Image
W-NOMINATE on 7th-term votes (dimensions interpreted as Left / Right and Pro / Anti EU):
display(HTML(open('term7-wnominate-viz.html').read()))
display(HTML(open('term7-wnominate-score.html').read()))
Using the same voting records, but reducing dimensions with NMF (non-negative matrix factorisation) instead:
Image(filename='term7-3d-nmf.png')
display(HTML(open('term7-nmf-score.html').read()))
Plotting the three NMF dimensions pairwise (x-y and x-z). Can these be interpreted similarly to the W-NOMINATE dimensions?
display(HTML(open('term7-count-nmf-xy.html').read()))
display(HTML(open('term7-count-nmf-xz.html').read()))
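To make the NMF step above more concrete, here is a small dependency-free sketch. NMF needs a non-negative input, so the example assumes each roll call is encoded as three indicator columns (Yes / No / Abstain); that encoding, the function names and the toy votes are assumptions for illustration, and the notebook's actual preprocessing may differ. The factorisation uses the classic multiplicative update rules; a library implementation such as scikit-learn's `NMF` would be the usual choice in practice.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def one_hot_votes(votes, outcomes=("Yes", "No", "Abstain")):
    """Encode each roll call as one non-negative indicator column per outcome."""
    return [
        [1.0 if vote == outcome else 0.0 for vote in row for outcome in outcomes]
        for row in votes
    ]


def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factorise v (n x m) into w (n x k) and h (k x m), both non-negative."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


def squared_error(v, w, h):
    """Squared Frobenius norm of v - w h."""
    approx = matmul(w, h)
    return sum((x - y) ** 2 for rv, ra in zip(v, approx) for x, y in zip(rv, ra))


# Toy data: four hypothetical MEPs voting on three roll calls.
votes = [
    ["Yes", "Yes", "No"],
    ["Yes", "Yes", "No"],
    ["No", "No", "Yes"],
    ["No", "Abstain", "Yes"],
]
vote_matrix = one_hot_votes(votes)
w, h = nmf(vote_matrix, k=2)  # each row of w is one MEP's reduced coordinates
```

Each row of `w` gives one MEP's position in the reduced space, which is what would be plotted; the rows of `h` show which vote outcomes load on each dimension, which is one way to probe whether a dimension has a political interpretation.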
Using t-SNE dimensionality reduction on the vote matrix. The t-SNE x and y dimensions aren't meaningful in the same way as the W-NOMINATE dimensions, but similar points (MEPs) should cluster together:
display(HTML(open('term7-count-tsne-plot.html').read()))
display(HTML(open('term7-tsne-score.html').read()))
Treating Yes / No / Abstain votes and MEPs as "words" and "contexts", word2vec can be used to embed them:
display(HTML(open('term7-sgns-tsne-sgns-plot.html').read()))
display(HTML(open('term7-tsne-sgns-score.html').read()))
Going from votes to word2vec to t-SNE doesn't cluster MEPs as well as applying t-SNE to the votes directly, but it does produce an alternative view. Is this useful? The advantage is that it is fast. The drawback is that, combined with t-SNE, it is less stable: different runs produce similar clusters, but t-SNE may arrange them differently.
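The "votes as words" framing above can be sketched as a set of (word, context) pairs, one per vote cast. Which side plays the "word" and which the "context", the token format, and all identifiers below are assumptions made for this example, not the notebook's actual preprocessing.

```python
from collections import Counter


def vote_pairs(votes):
    """Yield one (word, context) pair per vote cast.

    `votes` maps a hypothetical MEP id to {roll_call_id: outcome}.
    The MEP is used as the "word" and the vote outcome as the "context".
    """
    for mep, record in votes.items():
        for roll_call, outcome in record.items():
            yield f"mep:{mep}", f"{roll_call}:{outcome}"


# Toy voting records for three made-up MEPs on two made-up roll calls.
votes = {
    "mep_1": {"rc_1": "Yes", "rc_2": "No"},
    "mep_2": {"rc_1": "Yes", "rc_2": "Abstain"},
    "mep_3": {"rc_1": "No", "rc_2": "No"},
}

pairs = list(vote_pairs(votes))

# How often each vote outcome occurs as a context.
context_counts = Counter(context for _, context in pairs)

# Each pair can also be written as a two-token "sentence", the input
# shape most word2vec implementations expect.
sentences = [[word, context] for word, context in pairs]
```

MEPs who cast the same votes share the same contexts, so a skip-gram model trained on these pairs should place them near each other in the embedding space.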
W-NOMINATE approach on 6th-term votes:
display(HTML(open('term6-wnominate-viz.html').read()))
W-NOMINATE Silhouette scores for groups:
display(HTML(open('term6-wnominate-score.html').read()))
Using NMF (not as clear as with the 7th term):
Image(filename='term6-3d-nmf.png')
display(HTML(open('term6-nmf-score.html').read()))
display(HTML(open('term6-count-nmf-xy.html').read()))
display(HTML(open('term6-count-nmf-xz.html').read()))
t-SNE on votes directly:
display(HTML(open('term6-count-tsne-c-plot.html').read()))
display(HTML(open('term6-tsne-c-score.html').read()))
Votes to word2vec to t-SNE:
With this approach, the socialists (S & D) group is split; this could be a visualisation artefact.
display(HTML(open('term6-sgns-tsne-plot.html').read()))
display(HTML(open('term6-tsne-sgns-score.html').read()))